Yifan Hou

Diversity Matters: Revisiting Test-Time Compute in Vision-Language Models

May 29, 2026
Via arXiv

Unveiling the Visual Counting Bottleneck in Vision-Language Models

May 28, 2026
Via arXiv

Minimalist Compliance Control

Mar 01, 2026
Via arXiv

In-the-Wild Compliant Manipulation with UMI-FT

Jan 15, 2026
Via arXiv

Locomotion Beyond Feet

Jan 07, 2026
Via arXiv

Chimera: Diagnosing Shortcut Learning in Visual-Language Understanding

Sep 26, 2025
Via arXiv

Can Vision-Language Models Solve Visual Math Equations?

Sep 10, 2025
Via arXiv

Vision in Action: Learning Active Perception from Human Demonstrations

Jun 18, 2025
Via arXiv

DexMachina: Functional Retargeting for Bimanual Dexterous Manipulation

May 30, 2025
Via arXiv

DexUMI: Using Human Hand as the Universal Manipulation Interface for Dexterous Manipulation

May 29, 2025
Via arXiv