Biography

I am currently a Research Assistant at Westlake University, advised by Prof. Peidong Liu. Previously, I obtained a B.Eng. in Information Engineering from Guangdong University of Technology in 2024. In summer 2023, I visited the University of Cambridge to study deep learning and computer vision. Under the supervision of Prof. Wei Meng, I studied robotics and SLAM, gaining extensive hands-on experience in debugging and deploying real robotic systems.

My research interests lie in spatial intelligence, 3D/4D vision, multimodal learning, and world models. More specifically, I aim to explore how AI systems can learn robust and structured spatial information from visual observations, so that they can better represent geometry, semantics, dynamics, and affordances in the physical world. Ultimately, I hope these spatial representations can serve as a foundation for downstream VLMs, VLAs, and world models, enabling more effective reasoning, planning, and action for robotic systems in real-world environments.

Publications

* denotes equal contribution or advising; † denotes corresponding author

E-MoFlow: Learning Egomotion and Optical Flow from Event Data via Implicit Regularization

NeurIPS 2025

Project page Paper Code

Wenpu Li*, Bangyan Liao*, Yi Zhou, Qi Xu, Pian Wan, Peidong Liu†

E-MoFlow jointly learns 6-DoF egomotion and optical flow from events in a fully unsupervised paradigm, without explicit depth estimation.

SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alignment

NeurIPS 2025 Spotlight

Project page Paper Code

Qi Xu*, Dongxu Wei*†, Lingzhe Zhao, Wenpu Li, Zhangchi Huang, Shunping Ji†, Peidong Liu†

SIU3R is a feed-forward framework for simultaneous scene understanding and reconstruction from unposed images, unifying reconstruction with semantic, instance, panoptic, and text-referred segmentation.

Casual3DHDR: Deblurring High Dynamic Range 3D Gaussian Splatting from Casually Captured Videos

ACM MM 2025

Project page Paper Code

Shucheng Gong*, Lingzhe Zhao*, Wenpu Li*, Hong Xie†, Yin Zhang, Shiyu Zhao, Peidong Liu†

Casual3DHDR reconstructs sharp HDR Gaussians from videos, jointly optimizing exposure time, CRF, camera motion, and the HDR scene.

BeNeRF: Neural Radiance Fields from a Single Blurry Image and Event Stream

ECCV 2024

Project page Paper Code

Wenpu Li*, Pian Wan*, Peng Wang*, Jinghang Li, Yi Zhou, Peidong Liu†

From a single blurry image and event stream, BeNeRF recovers neural radiance fields and camera motion, then decodes the scene into a clear, vivid novel-view video stream.

[Demo: Lego and real-world scenes; panels: blurry input, decoded stream]

Robotics Experience

Humanoid Policy Reproduction and Real-Robot Deployment

Xiang Liu*, Wenpu Li*, Lingzhe Zhao*

Reproduced GMT and UH-1 across the motion retargeting, policy inference, and real-robot deployment pipelines, then deployed the policies on the Unitree G1, gaining hands-on experience with sim-to-real transfer, whole-body control, and text-to-motion control.

Wheeled Robot Equipped with an Arm for Automatic Storage

Qingrui Zhu*, Wenpu Li*, Wenbin Zheng, Yong Zhang, Junyao Li

Built a wheeled robot with an onboard arm for automated warehouse sorting. The system integrates STM32, Raspberry Pi, and OpenMV hardware. We implemented forward and inverse kinematics for arm control, designed the communication and motion control for the mobile chassis, and applied visual recognition algorithms.

Self-Localization UAV Leveraging Visual Odometry

Wenpu Li*, Guohua Zhang*

Deployed RealSense T265 and ZED cameras for UAV visual odometry, and evaluated ORB-SLAM2, Stereo-DSO, and other SLAM algorithms in real-world flight scenarios.

Open-Source Project

LMMs-Eval: Probing Intelligence in the Real World

A unified evaluation toolkit for large multimodal models across text, image, video, and audio tasks, with an emphasis on reproducible, efficient, and trustworthy evaluation.

Added the VSI-SUPER benchmark and Cambrian-S model support for spatial and video understanding tasks. #1267 #1268
Resolved Qwen3-VL and Qwen2.5-VL video bugs affecting timestamp alignment and temporal position encoding. #1261 #1269 #1260 #1244
Fixed bugs across vLLM, InternVL, OCRBench, and VSI-Bench to improve inference backend stability and benchmark correctness.

Languages

Chinese: Native Language
Japanese: Native Language — I was born in Japan and lived there for eight years.
English: Intermediate Level
Cantonese: Entry Level — I became deeply fond of the language while studying in Guangzhou.

Awards

National Bronze Medal in Automatic Storage/Retrieval System at RoboCup China Open, 2021
National Second Prize in Automatic Storage/Retrieval System at RoboCup China Open, 2022
National Second Prize in iCAN Innovation Contest, 2022
National Third Prize in Blue Bridge Cup, 2022
First Prize in Contemporary Undergraduate Mathematical Contest in Modeling (Guangdong Division), 2022
Third Prize in National Undergraduate Electronics Design Contest (Guangdong Division), 2021

Academic Services

Journal Reviewer: T-PAMI
Conference Reviewer: CVPR, NeurIPS, ECCV, IROS

Acknowledgements

Thanks to Wenyi Zhang for taking the portrait for my homepage.