Tai Wang

Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning

Mar 24, 2026
Via arXiv

RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation

Feb 10, 2026
Via arXiv

Nimbus: A Unified Embodied Synthetic Data Generation Framework

Jan 29, 2026
Via arXiv

InternVLA-A1: Unifying Understanding, Generation and Action for Robotic Manipulation

Jan 05, 2026
Via arXiv

VL-LN Bench: Towards Long-horizon Goal-oriented Navigation with Active Dialogs

Dec 31, 2025
Via arXiv

LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry

Dec 23, 2025
Via arXiv

MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

Dec 11, 2025
Via arXiv

Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation

Dec 09, 2025
Via arXiv

ChangingGrounding: 3D Visual Grounding in Changing Scenes

Oct 16, 2025
Via arXiv

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization

Aug 07, 2025
Via arXiv