Ngan Le

Self-Improving VLA Policies: Selected Diffusion Noise for Spurious-Robust Action Smoothing

Jun 12, 2026
Via arXiv

Dual-State Slot Attention: Decoupling Appearance and Identity for Video Object-Centric Learning

Jun 10, 2026
Via arXiv

TSA: Temporal Slot Activation for Persistent Object-Centric Video Representation

Jun 10, 2026
Via arXiv

DRIVESPATIAL: A Benchmark for Spatiotemporal Intelligence in VLMs for Autonomous Driving

May 22, 2026
Via arXiv

CodeGraphVLP: Code-as-Planner Meets Semantic-Graph State for Non-Markovian Vision-Language-Action Models

Apr 24, 2026
Via arXiv

SemLT3D: Semantic-Guided Expert Distillation for Camera-only Long-Tailed 3D Object Detection

Apr 20, 2026
Via arXiv

GazeQwen: Lightweight Gaze-Conditioned LLM Modulation for Streaming Video Understanding

Mar 26, 2026
Via arXiv

SIGMA: A Physics-Based Benchmark for Gas Chimney Understanding in Seismic Images

Mar 24, 2026
Via arXiv

DuFal: Dual-Frequency-Aware Learning for High-Fidelity Extremely Sparse-view CBCT Reconstruction

Jan 21, 2026
Via arXiv

Clutter-Resistant Vision-Language-Action Models through Object-Centric and Geometry Grounding

Dec 27, 2025
Via arXiv