Quang Minh Dinh

BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition

Apr 30, 2025
Via arXiv

TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning

Apr 14, 2024
Via arXiv