Long Dinh

Finetuning Vision-Language-Action Models Requires Fewer Layers Than You Think

Jun 18, 2026

Start Right, Arrive Right: Asynchronous Execution via Initial Noise Selection

Jun 18, 2026

EquiVLA: A General Framework for Rotationally Equivariant Vision-Language-Action Models

Jun 18, 2026

Learning from Pixels with Expert Observations

Jul 15, 2023