Quang Minh Dinh

BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition

Apr 30, 2025
Via arXiv

TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning

Apr 14, 2024
Via arXiv