Tiejun Huang

Emu3.5: Native Multimodal Models are World Learners

Oct 30, 2025
Via arXiv

$π_\texttt{RL}$: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

Oct 29, 2025
Via arXiv

RoboBrain 2.0 Technical Report

Jul 02, 2025
Via arXiv

OmniGen2: Exploration to Advanced Multimodal Generation

Jun 23, 2025
Via arXiv

SpikePingpong: High-Frequency Spike Vision-based Robot Learning for Precise Striking in Table Tennis Game

Jun 07, 2025
Via arXiv

RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics

Jun 04, 2025
Via arXiv

SpikeStereoNet: A Brain-Inspired Framework for Stereo Depth Estimation from Spike Streams

May 26, 2025
Via arXiv

SpikeGen: Generative Framework for Visual Spike Stream Processing

May 23, 2025
Via arXiv

SPKLIP: Aligning Spike Video Streams with Natural Language

May 19, 2025
Via arXiv

SOTA: Spike-Navigated Optimal TrAnsport Saliency Region Detection in Composite-bias Videos

May 01, 2025
Via arXiv