🥺 E-MoFlow

Learning Egomotion and Optical Flow from
Event Data via Implicit Regularization

NeurIPS 2025

Wenpu Li1*    Bangyan Liao1,2*    Yi Zhou3    Qi Xu1,4    Pian Wan5    Peidong Liu1†
* denotes equal contribution. † denotes corresponding author.
1Westlake University   2Zhejiang University  
3Hunan University   4Wuhan University   5Georgia Institute of Technology

Abstract

TL;DR: We reveal that implicit regularization enables mutually reinforcing learning of optical flow and egomotion under a fully unsupervised paradigm.
Keywords: Self-Supervised Learning, 3D Vision, Optical Flow Estimation, Motion Estimation, Event Camera
[Figure: overview of E-MoFlow]

The estimation of optical flow and 6-DoF egomotion, two fundamental tasks in 3D vision, has typically been addressed independently. For neuromorphic vision sensors (e.g., event cameras), however, the lack of robust data association makes solving the two problems separately ill-posed, especially in the absence of ground-truth supervision.

Existing works mitigate this ill-posedness either by enforcing smoothness of the flow field via an explicit variational regularizer or by leveraging explicit structure-and-motion priors in the parametrization to improve event alignment. The former biases the results and adds computational overhead, while the latter, which parametrizes the optical flow in terms of scene depth and camera motion, often converges to suboptimal local minima.
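
For concreteness, such an explicit variational regularizer typically takes the form of a total-variation penalty on a dense flow field. Below is a minimal PyTorch sketch of this baseline idea; the tensor layout and function name are our own illustration, not code from any cited work:

import torch

def tv_smoothness_loss(flow: torch.Tensor) -> torch.Tensor:
    # flow: (B, 2, H, W) dense field of per-pixel (u, v) displacements.
    # Penalize first-order spatial differences of the flow, which is the
    # kind of explicit variational regularizer discussed above.
    du_x = flow[:, :, :, 1:] - flow[:, :, :, :-1]  # horizontal differences
    du_y = flow[:, :, 1:, :] - flow[:, :, :-1, :]  # vertical differences
    return du_x.abs().mean() + du_y.abs().mean()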

To address these issues, we propose an unsupervised pipeline that jointly optimizes egomotion and optical flow via implicit spatial-temporal and geometric regularization. First, by modeling the camera's egomotion as a continuous spline and the optical flow as an implicit neural representation, our method inherently embeds spatial-temporal coherence through its inductive biases. Second, we incorporate structure-and-motion priors through differential geometric constraints, bypassing explicit depth estimation while maintaining rigorous geometric consistency. As a result, our framework, E-MoFlow, unifies egomotion and optical flow estimation via implicit regularization under a fully unsupervised paradigm.
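
To make the two representations concrete, the following is a minimal sketch of the idea. It assumes a plain ReLU MLP for the flow field and, for brevity, linear interpolation between twist control knots in place of an actual spline; the module names, layer sizes, and the absence of positional encoding are all simplifying assumptions rather than our exact implementation:

import torch
import torch.nn as nn

class ImplicitFlowField(nn.Module):
    # Coordinate MLP mapping a space-time point (x, y, t) to a 2-D flow
    # vector. The smoothness of the MLP acts as an implicit
    # spatial-temporal regularizer, so no explicit variational term
    # is needed.
    def __init__(self, hidden: int = 256, depth: int = 4):
        super().__init__()
        layers, dim = [], 3
        for _ in range(depth):
            layers += [nn.Linear(dim, hidden), nn.ReLU()]
            dim = hidden
        layers.append(nn.Linear(dim, 2))
        self.mlp = nn.Sequential(*layers)

    def forward(self, xyt: torch.Tensor) -> torch.Tensor:
        # xyt: (N, 3) normalized event coordinates and timestamps
        return self.mlp(xyt)

class LinearSE3Spline(nn.Module):
    # Toy continuous-time egomotion model: linear interpolation of a 6-D
    # twist between learned control knots. The paper uses a spline; the
    # interpolation order here is a simplifying assumption.
    def __init__(self, num_knots: int = 8):
        super().__init__()
        self.knots = nn.Parameter(torch.zeros(num_knots, 6))  # (v, w) twists

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t in [0, 1]; returns an interpolated twist per query time.
        s = t * (self.knots.shape[0] - 1)
        i0 = s.floor().long().clamp(max=self.knots.shape[0] - 2)
        w = (s - i0.float()).unsqueeze(-1)
        return (1 - w) * self.knots[i0] + w * self.knots[i0 + 1]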

Experiments demonstrate the versatility of our framework in general 6-DoF motion scenarios: it achieves state-of-the-art performance among unsupervised methods and remains competitive even with supervised approaches.


Demo

MVSEC

DSEC




Pipeline

[Figure: E-MoFlow pipeline]

Given the input event data, we use a differential flow loss and a differential geometric loss to train the neural network and the spline parameters. Jointly optimizing these two losses avoids getting stuck in local minima, making it possible to solve this ill-posed problem through self-supervised learning. Solid arrows denote the forward process; dashed arrows denote gradient backpropagation.
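
A schematic version of this joint optimization might look as follows. Everything here is illustrative: diff_flow_loss, diff_geom_loss, and event_loader are hypothetical stand-ins for the paper's differential flow loss, differential geometric loss, and event batch loader, and the loop reuses the modules sketched in the Abstract above:

import torch

# flow_field / pose_spline: the modules sketched earlier.
# diff_flow_loss, diff_geom_loss, event_loader: hypothetical names; their
# exact definitions are given in the paper, not reproduced here.
flow_field = ImplicitFlowField()
pose_spline = LinearSE3Spline()
optimizer = torch.optim.Adam(
    list(flow_field.parameters()) + list(pose_spline.parameters()), lr=1e-4
)

for events in event_loader:          # events: (N, 4) rows of (x, y, t, polarity)
    xyt = events[:, :3]
    flow = flow_field(xyt)           # per-event optical flow vectors
    twist = pose_spline(xyt[:, 2])   # per-event camera twist from the spline

    # Joint objective: both losses backpropagate into both sets of parameters.
    loss = diff_flow_loss(events, flow) + diff_geom_loss(flow, twist, xyt)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()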


Results

We performed a qualitative comparison with strong baselines on the MVSEC dataset. Note that we visualize optical flow only at pixels where events are triggered.

[Figure: qualitative optical flow comparison on MVSEC]

We also conducted a qualitative evaluation of 6-DoF motion estimation. The motion estimated by our method closely matches the ground truth.

[Figure: estimated 6-DoF motion vs. ground truth]

Furthermore, we conduct comprehensive quantitative benchmarking for optical flow estimation, comparing our method against four distinct paradigms: Supervised Learning (SL), Self-Supervised Learning (SSL), Unsupervised Learning (USL), and Model-Based (MB) approaches. Our method achieves state-of-the-art performance in the dt = 1 setting and the second-highest accuracy in the dt = 4 setting.

[Figure: quantitative optical flow benchmark]

We also performed motion estimation benchmarking on the MVSEC dataset; the results demonstrate that our unsupervised method achieves state-of-the-art performance.

[Figure: motion estimation benchmark on MVSEC]

Citation

@inproceedings{li2025emoflow,
  author    = {Wenpu Li and Bangyan Liao and Yi Zhou and Qi Xu and Pian Wan and Peidong Liu},
  title     = {E-MoFlow: Learning Egomotion and Optical Flow from Event Data via Implicit Regularization},
  booktitle = {Annual Conference on Neural Information Processing Systems (NeurIPS)},
  year      = {2025}
}