
Asynchronous Event-Inertial Odometry using a Unified Gaussian Process Regression Framework

Li, Wang, Liu et al.
Recent works have combined monocular event camera and inertial measurement unit to estimate the $SE(3)$ trajectory. However, the asynchronicity of event cameras brings a great challenge to conventional fusion algorithms. In this paper, we present an asynchronous event-inertial odometry under a unified Gaussian Process (GP) regression framework to naturally fuse asynchronous data associations and inertial measurements. A GP latent variable model is leveraged to build data-driven motion prior and acquire the analytical integration capacity. Then, asynchronous event-based feature associations and integral pseudo measurements are tightly coupled using the same GP framework. Subsequently, this fusion estimation problem is solved by underlying factor graph in a sliding-window manner. With consideration of sparsity, those historical states are marginalized orderly. A twin system is also designed for comparison, where the traditional inertial preintegration scheme is embedded in the GP-based framework to replace the GP latent variable model. Evaluations on public event-inertial datasets demonstrate the validity of both systems. Comparison experiments show competitive precision compared to the state-of-the-art synchronous scheme.

Basic Information

  • Paper ID: 2412.03136
  • Title: Asynchronous Event-Inertial Odometry using a Unified Gaussian Process Regression Framework
  • Authors: Xudong Li, Zhixiang Wang, Zihao Liu, Yizhai Zhang, Fan Zhang, Xiuming Yao, Panfeng Huang
  • Category: cs.RO (Robotics)
  • Publication Date: December 4, 2024 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2412.03136

Abstract

This paper proposes an asynchronous event-inertial odometry method based on a unified Gaussian Process (GP) regression framework for naturally fusing asynchronous data association and inertial measurements. The method leverages GP latent variable models to construct data-driven motion priors and obtain analytical integration capabilities, then tightly couples asynchronous event feature association and integrated pseudo-measurements within the same GP framework. The fusion estimation problem is solved through a sliding window factor graph with ordered marginalization of historical states considering sparsity. The authors also design a comparative system that embeds traditional inertial preintegration schemes into the GP framework. Evaluation on public event-inertial datasets demonstrates the effectiveness of both systems, with comparative experiments showing accuracy comparable to state-of-the-art synchronous methods.

Research Background and Motivation

Problem Definition

Event cameras are bio-inspired visual sensors with an asynchronous triggering mechanism: each pixel independently records changes in light intensity. This gives them significant advantages over traditional cameras: low power consumption, low latency, high dynamic range, and high temporal resolution. However, the same asynchronicity is a substantial challenge for traditional fusion algorithms, which assume measurements arrive at discrete, synchronized timestamps.

Limitations of Existing Methods

  1. Frame-based discrete-time schemes: Accumulate events into fixed temporal windows, losing temporal diversity of events, resulting in motion blur and requiring additional deblurring operations
  2. Traditional IMU preintegration: Applied within discrete-time frameworks, discarding numerous inter-frame temporal measurements and losing fine-grained motion information
  3. Computational efficiency: Existing GP methods typically employ full smoothing backends with high computational costs

Research Motivation

To fully exploit the high temporal resolution characteristics of event cameras, there is an urgent need to introduce new methods for fusing asynchronous and high temporal resolution event-inertial observations. This paper focuses on the asynchronous measurement fusion problem and proposes a solution based on a unified GP framework.

Core Contributions

  1. Unified GP Framework: Proposes a unified Gaussian Process regression framework capable of naturally handling fusion of asynchronous event feature association and inertial measurements
  2. GP Latent Variable Model: Introduces latent variable models into the GP regression framework for analytical integration of inertial measurements and implicitly inducing data-driven GP priors
  3. Dual System Design: Implements comparative systems for two fusion approaches:
    • CT-IMU: Sparse GP prior + IMU preintegration
    • GP-IMU: GP regression preintegration
  4. Efficient Sliding Window: Employs sliding window factor graph optimization with marginalization strategies to maintain computational efficiency
  5. Fully Asynchronous Processing: Uses EKLT for event-driven feature detection and tracking, preserving the high temporal resolution characteristics of event cameras

Methodology Details

Task Definition

  • Input: Asynchronous event stream and IMU measurement data
  • Output: SE(3) trajectory estimation (including position, orientation, and velocity)
  • Constraints: Handle asynchronous data association while maintaining computational efficiency

Model Architecture

1. Sparse GP Prior

Employs white noise acceleration (WNOA) motion prior for SE(3) modeling:

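$$\dot{T}_{wb}(t) = T_{wb}(t)\,\varpi^{b}_{wb}(t)^{\wedge}$$
$$\dot{\varpi}^{b}_{wb}(t) = w(t), \quad w(t) \sim \mathcal{GP}\left(0,\, Q_c\,\delta(t - t')\right)$$

where $\varpi^{b}_{wb}(t)$ is the velocity in body coordinates and $w(t)$ is a generalized acceleration vector modeled as a zero-mean white-noise GP.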
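
To make the WNOA prior concrete, the following is a minimal sketch of its well-known vector-space form, where the state is [position, velocity] per axis. This is a simplification: the paper applies the prior on SE(3), and the function names and the choice of a flat vector space here are illustrative assumptions, not the authors' implementation. The sketch builds the transition matrix, the accumulated process-noise covariance, and the GP interpolation between two neighboring states.

```python
import numpy as np

def transition(dt, dim=1):
    """State transition Phi(dt) for the stacked state [position, velocity]."""
    I = np.eye(dim)
    return np.block([[I, dt * I], [np.zeros((dim, dim)), I]])

def process_noise(dt, Qc):
    """Covariance Q(dt) accumulated by white-noise acceleration over dt."""
    Qc = np.atleast_2d(Qc)
    return np.block([
        [dt**3 / 3.0 * Qc, dt**2 / 2.0 * Qc],
        [dt**2 / 2.0 * Qc, dt * Qc],
    ])

def interpolate(x_k, x_k1, t_k, t_k1, tau, Qc):
    """Posterior mean of the state at tau in [t_k, t_k1] given both end states."""
    dim = np.atleast_2d(Qc).shape[0]
    Q_tau = process_noise(tau - t_k, Qc)
    Q_T = process_noise(t_k1 - t_k, Qc)
    # Psi weights the later state, Lam weights the earlier state.
    Psi = Q_tau @ transition(t_k1 - tau, dim).T @ np.linalg.inv(Q_T)
    Lam = transition(tau - t_k, dim) - Psi @ transition(t_k1 - t_k, dim)
    return Lam @ x_k + Psi @ x_k1

# Hypothetical usage: two states 0.05 s apart on a constant-velocity path.
x_k = np.array([1.0, 2.0])                 # position 1.0, velocity 2.0
x_k1 = transition(0.05) @ x_k              # state at the next knot
x_mid = interpolate(x_k, x_k1, 0.0, 0.05, 0.02, Qc=1.0)
```

Because the prior mean is constant velocity, interpolating between two states that lie on a constant-velocity path reproduces that path exactly; any query time inside the interval depends only on the two neighboring states, which is what keeps the prior sparse.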

2. GP Regression Preintegration

Models relative acceleration and rotational velocity as independent GPs:

$$\dot{r}_{b_k b}(t) \sim \mathcal{GP}\left(0,\, k_r(t, t')\right)$$
$$a_{b b_k}(t) \sim \mathcal{GP}\left(0,\, k_a(t, t')\right)$$

Obtains noisy observations of GP through latent states ρ̂ and α̂, then leverages GP inference capabilities to compute preintegrated velocity, position, and rotation increments.
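
The "analytical integration" idea can be illustrated with a one-dimensional sketch: fit a zero-mean GP to sampled acceleration and integrate the posterior mean in closed form, because the integral of a squared-exponential kernel has an erf expression. The kernel choice, hyperparameters, and function names below are assumptions for illustration; the paper's latent-state formulation on rotation and acceleration is more involved and is not reproduced here.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def se_kernel(t1, t2, sigma, length):
    """Squared-exponential kernel matrix between two sets of timestamps."""
    d = t1[:, None] - t2[None, :]
    return sigma**2 * np.exp(-0.5 * (d / length) ** 2)

def se_kernel_integral(a, b, t, sigma, length):
    """Closed-form integral of k(s, t_i) over s in [a, b] for each t_i."""
    scale = sigma**2 * length * math.sqrt(math.pi / 2.0)
    s = math.sqrt(2.0) * length
    return scale * (_erf((b - t) / s) - _erf((a - t) / s))

def gp_integrate(t_obs, y_obs, a, b, sigma=1.0, length=0.5, noise_var=1e-6):
    """Integral over [a, b] of the GP posterior mean fitted to (t_obs, y_obs)."""
    K = se_kernel(t_obs, t_obs, sigma, length) + noise_var * np.eye(len(t_obs))
    alpha = np.linalg.solve(K, y_obs)
    return float(se_kernel_integral(a, b, t_obs, sigma, length) @ alpha)

# Hypothetical usage: acceleration samples a(t) = cos(t) on [0, 1];
# the velocity increment over the interval should be close to sin(1).
t_obs = np.linspace(0.0, 1.0, 20)
delta_v = gp_integrate(t_obs, np.cos(t_obs), 0.0, 1.0)
```

The same weights `alpha` can be reused for any integration interval, so increments between arbitrary (asynchronous) timestamps come at the cost of one kernel-integral evaluation each.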

3. System Architecture

The entire system comprises two parallel threads:

  • Asynchronous feature tracking frontend: Uses EKLT for event-driven feature detection and tracking
  • GP-based sliding window backend: Handles feature management, triangulation, and factor graph optimization

Technical Innovations

1. Unified Framework Design

Both methods operate within the same GP framework but handle IMU data differently:

  • CT-IMU: Queries states on continuous-time trajectory, separately fuses IMU measurements
  • GP-IMU: Relies on IMU measurements for state inference, reducing trajectory prior constraints

2. Interpolation Projection Factor

Obtains the pose $T_{wb_\tau}$ at measurement time $t_\tau$ through GP interpolation, with the visual residual defined as:

$$r_V(T_{wb_\tau}, l_i, \hat{z}_i) = \hat{z}_i - \frac{1}{d_i} K \left(T_{wb_\tau} T_{b_\tau c_\tau}\right)^{T} l_i$$
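
A reprojection residual of this kind can be sketched as follows. The sketch assumes a pinhole camera, a landmark expressed in world coordinates, and that the landmark is moved into the camera frame by inverting the camera-in-world pose (the body pose composed with the body-to-camera extrinsic). These conventions and all names are assumptions for illustration; the body pose is taken as already interpolated at the measurement time.

```python
import numpy as np

def projection_residual(T_wb, T_bc, K, l_w, z_hat):
    """Pixel residual between a tracked feature and a projected landmark.

    T_wb  : 4x4 body pose in the world frame at the measurement time
    T_bc  : 4x4 camera pose in the body frame (extrinsic)
    K     : 3x3 pinhole intrinsic matrix
    l_w   : landmark position in the world frame, shape (3,)
    z_hat : measured pixel location, shape (2,)
    """
    T_wc = T_wb @ T_bc
    p_c = np.linalg.inv(T_wc) @ np.append(l_w, 1.0)  # landmark in camera frame
    depth = p_c[2]
    uv = (K @ p_c[:3]) / depth                       # perspective division
    return z_hat - uv[:2]

# Hypothetical usage with made-up intrinsics and an identity extrinsic.
K = np.array([[200.0, 0.0, 120.0], [0.0, 200.0, 90.0], [0.0, 0.0, 1.0]])
T_wb = np.eye(4)
T_wb[0, 3] = 1.0                                     # body shifted 1 m along x
r = projection_residual(T_wb, np.eye(4), K,
                        np.array([0.5, -0.25, 2.0]), np.array([70.0, 65.0]))
```

Since every feature measurement carries its own timestamp, a residual like this is evaluated once per measurement with a pose queried at that exact time, rather than at a shared frame time.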

3. Sliding Window Optimization

Employs a dynamic marginalization strategy that marginalizes historical states and their related landmarks in order, which preserves the sparsity of the Hessian matrix.
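
Marginalizing states out of a Gauss-Newton system is commonly done with a Schur complement on the normal equations H x = b. The sketch below shows that generic step on a dense matrix; it is not the authors' code (the summary notes their implementation is GTSAM-based), and the function name and dense layout are illustrative assumptions.

```python
import numpy as np

def marginalize(H, b, m_idx):
    """Remove variables m_idx from H x = b via the Schur complement.

    Returns the reduced system (H_prior, b_prior) on the kept variables,
    plus the indices of those kept variables.
    """
    n = H.shape[0]
    keep = np.setdiff1d(np.arange(n), m_idx)
    H_mm = H[np.ix_(m_idx, m_idx)]
    H_km = H[np.ix_(keep, m_idx)]
    H_kk = H[np.ix_(keep, keep)]
    # Solve H_mm^{-1} [H_mk, b_m] in one call instead of forming the inverse.
    sol = np.linalg.solve(H_mm, np.column_stack([H_km.T, b[m_idx]]))
    H_prior = H_kk - H_km @ sol[:, :-1]
    b_prior = b[keep] - H_km @ sol[:, -1]
    return H_prior, b_prior, keep

# Hypothetical usage: a random symmetric positive-definite system.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
H = A @ A.T + 5.0 * np.eye(5)
b = rng.standard_normal(5)
H_prior, b_prior, keep = marginalize(H, b, [0, 1])
```

The reduced system yields the same solution for the kept variables as the full system, but the subtracted term fills in entries between every pair of kept variables that were connected to the removed ones. That fill-in is why the order in which states and landmarks are marginalized matters for sparsity.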

Experimental Setup

Datasets

  • DAVIS Dataset: Records aggressive motion data using DAVIS240C (240×180) across multiple scenes
  • MVSEC Dataset: Uses left event camera data (DAVIS 346B, 346×260)

Evaluation Metrics

  • RMS RTE: Root mean square relative trajectory error for accuracy assessment
  • Computation Time: Average time consumption of each module
  • Factor Graph Scale: Complexity indicator of optimization problem

Comparison Methods

  • Vidal et al. [3] (E+I configuration)
  • Guan & Lu [4] event-inertial method
  • Internal comparison of two proposed methods

Implementation Details

  • Parallax threshold: 8 pixels
  • Minimum feature track length: 4
  • GP-IMU latent states: 400
  • Sliding window minimum size: 40
  • State temporal interval: 0.05 seconds

Experimental Results

Main Results

Sequence              CT-IMU   GP-IMU   Ref. 4   Ref. 3
dynamic translation   0.030    0.060    0.056    0.037
dynamic 6dof          0.076    0.056    0.073    0.040
poster translation    0.087    0.082    0.242    0.087
poster 6dof           0.156    0.084    0.210    0.197
boxes 6dof            0.347    0.151    0.073    0.078
shapes 6dof           0.108    0.244    ---      0.163

Performance Analysis

  1. Accuracy Performance: Both methods achieve accuracy comparable to discrete-time optimization methods on most sequences and outperform them on some (for example, poster 6dof)
  2. Computational Efficiency: GP-IMU typically exhibits lower computational costs due to fewer variables
  3. Robustness: GP-IMU is more sensitive to IMU noise because it relies on IMU-driven GP for constructing visual residuals

Time Consumption Analysis

Method       Frontend   Optimization   Marginalization   IMU Preintegration   Other
CT-IMU (s)   1273.97    247.83         43.951            0.177                0.743
GP-IMU (s)   1274.51    182.05         44.914            4.713                0.693

The EKLT tracker consumes approximately 80% of total time, being the most time-consuming component. GP-IMU is faster in graph optimization but slightly slower in IMU preintegration.

Related Work

Event-Inertial Odometry Classification

  1. Frame-based discrete-time schemes: Inherit traditional frame camera algorithms, performing data association on event accumulation
  2. Event-driven continuous-time methods: Directly process event streams, employing continuous-time backends

Gaussian Process Applications in Robotics

GP-based continuous-time trajectory representations were first applied to trajectory inference for scanning LiDAR and other asynchronous sensors. More recent research applies GPs to monocular event-based visual odometry, though at high computational cost.

Conclusions and Discussion

Main Conclusions

  1. Both proposed GP methods effectively handle asynchronous event-inertial fusion problems
  2. GP-IMU achieves higher accuracy on most sequences but is more sensitive to IMU noise
  3. The sliding window strategy effectively controls computational complexity
  4. The method demonstrates competitive performance in complex motion scenarios

Limitations

  1. Real-time Performance: The system cannot currently run in real time because it retains all asynchronous frontend measurements for optimization
  2. Insufficient Robustness: Lacks outlier rejection or motion compensation mechanisms
  3. IMU Quality Dependency: GP-IMU method requires high-quality IMU data
  4. Aggressive Motion Constraints: Both methods may be affected by rapid acceleration changes

Future Directions

  1. Information-theoretic graph sparsification strategies for real-time performance
  2. Improved frontend to enhance system robustness
  3. Algorithm optimization for low-quality IMU
  4. Extension to more complex motion patterns

In-Depth Evaluation

Strengths

  1. Theoretical Innovation: The unified GP framework elegantly addresses asynchronous fusion with solid theoretical foundations
  2. Systematic Research: Dual system design provides comprehensive comparative analysis
  3. Sufficient Experiments: Thorough evaluation on multiple public datasets
  4. Engineering Implementation: GTSAM-based implementation ensures reproducibility

Weaknesses

  1. Real-time Limitations: Current inability to meet real-time application requirements limits practical value
  2. Frontend Dependency: Over-reliance on EKLT frontend with insufficient handling of exceptional cases
  3. Limited Applicability: Certain constraints on IMU quality and motion types
  4. Insufficient Theoretical Analysis: Lacks in-depth analysis of theoretical differences between the two methods

Impact

  1. Academic Value: Provides new theoretical framework for event camera and inertial fusion
  2. Practical Potential: After addressing real-time issues, promising applications in robot navigation
  3. Extensibility: Framework demonstrates good extensibility for other sensor fusion scenarios

Applicable Scenarios

  1. High-Dynamic Environments: Suitable for high-speed motion scenarios where traditional cameras struggle
  2. Sufficient Computational Resources: Appropriate for applications with high accuracy requirements and relatively abundant computational resources
  3. Research Platforms: Provides valuable benchmark methods for event camera research

References

This paper cites 26 relevant references covering important works in event camera surveys, IMU preintegration, continuous-time estimation, Gaussian process regression, and other key domains, with comprehensive and authoritative citations.


Overall Assessment: This is an innovative work in the event-inertial odometry field that proposes a unified GP framework providing new insights for handling asynchronous sensor fusion. Despite limitations such as real-time performance, it makes significant theoretical contributions with sufficient experimental evaluation, laying a solid foundation for subsequent research in this domain.