Boyi Li

One-Policy-Fits-All: Geometry-Aware Action Latents for Cross-Embodiment Manipulation

Mar 15, 2026
Via arXiv

Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing

Mar 12, 2026
Via arXiv

$V_1$: Unifying Generation and Self-Verification for Parallel Reasoners

Mar 04, 2026
Via arXiv

DexImit: Learning Bimanual Dexterous Manipulation from Monocular Human Videos

Feb 10, 2026
Via arXiv

TADS: Task-Aware Data Selection for Multi-Task Multimodal Pre-Training

Feb 05, 2026
Via arXiv

Toward Cognitive Supersensing in Multimodal Large Language Model

Feb 02, 2026
Via arXiv

Accelerating Structured Chain-of-Thought in Autonomous Vehicles

Feb 02, 2026
Via arXiv

Counterfactual VLA: Self-Reflective Vision-Language-Action Model with Adaptive Reasoning

Dec 30, 2025
Via arXiv

Towards Efficient and Effective Multi-Camera Encoding for End-to-End Driving

Dec 12, 2025
Via arXiv

FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos

Dec 11, 2025
Via arXiv