Tai Wang

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization

Aug 07, 2025
Via arXiv

InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation

Jul 23, 2025
Via arXiv

Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities

Jul 17, 2025
Via arXiv

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

Jul 10, 2025
Via arXiv

CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation

Jun 24, 2025
Via arXiv

GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation

Jun 12, 2025
Via arXiv

MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence

May 29, 2025
Via arXiv

LabUtopia: High-Fidelity Simulation and Hierarchical Benchmark for Scientific Embodied Agents

May 28, 2025
Via arXiv

GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scenes

May 26, 2025
Via arXiv

NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance

May 13, 2025
Via arXiv