Bin Zhao

Hume: Introducing System-2 Thinking in Visual-Language-Action Model

May 29, 2025
Via arXiv

Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation

May 27, 2025
Via arXiv

Dynamic Manipulation of Deformable Objects in 3D: Simulation, Benchmark and Learning Strategy

May 23, 2025
Via arXiv

Representation Discrepancy Bridging Method for Remote Sensing Image-Text Retrieval

May 22, 2025
Via arXiv

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

May 19, 2025
Via arXiv

AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations

Apr 10, 2025
Via arXiv

Think Small, Act Big: Primitive Prompt Learning for Lifelong Robot Manipulation

Apr 01, 2025
Via arXiv

MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation

Mar 14, 2025
Via arXiv

AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems

Mar 09, 2025
Via arXiv

OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation

Feb 25, 2025
Via arXiv