Dinh Bach Vu

Speechless: Speech Instruction Training Without Speech for Low Resource Languages

May 23, 2025
Via arXiv

AlphaSpace: Enabling Robotic Actions through Semantic Tokenization and Symbolic Reasoning

Mar 27, 2025
Via arXiv

PoseLess: Depth-Free Vision-to-Joint Control via Direct Image Mapping with VLM

Mar 11, 2025
Via arXiv

AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO

Feb 21, 2025
Via arXiv

Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant

Oct 20, 2024
Via arXiv